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Protein data bank entries obtain distinct, reproducible flexibility characteristics determined by 
normal mode analyses of their three dimensional coordinate files. We study the effectiveness and 
sensitivity of this technique by analyzing the results on one class of glycosidases: family 10 xy¬ 
lanases. A conserved tryptophan that appears to affect access to the active site can be in one of 
two conformations according to X-ray crystallographic electron density data. The two alternate 
orientations of this active site tryptophan lead to distinct flexibility spectra, with one orientation 
thwarting the oscillations seen in the other. The particular orientation of this sidechain furthermore 
affects the appearance of the motility of a distant, C terminal region we term the mallet. The mallet 
region is known to separate members of this family of enzymes into two classes. 


I. INTRODUCTION 
A. Protein NMA 

Protein atomic coordinates, determined by a variety of techniques, are deposited in the Protein Data Bank (PDB) 
[T]. The PDB coordinate files permit determination of numerous physical qualities, including charge distribution at 
a given pH, mass distribution, principal axes of rotation as well as the internal symmetry axes that characterize each 
object’s flexibility spectrum. Internal symmetry is determined by normal mode analysis (NMA), a technique first 
applied to protein structures in the 1980s after the first generations of (classical) protein force fields matured [2]-[3]. 
Go [2] and Levitt [5] independently formulated the eigenvector equations in torsional angle space while Brooks and 
Karpins [3] studied the vibrational response in Cartesian space. 

As originally formulated, protein NMA requires an energy minimization to bring the PDB coordinates to a local 
energy minimum along the force field characterizing the object’s energy surface [3]. This necessarily distorts the 
PDB coordinates away from the experimentally determined values and precludes the possibility of a true PDB-NMA. 
Ben-Avraham introduced the use of a (Hookean) potential energy function, one that is a minimum at the outset, to 
model inter-monomer interactions in F-actin so as to compute the atomic-based dispersion spectrum of this polymer 
[6]. We then generalized this formulation to model a protein’s intra-monomer interactions, effectively introducing the 
field of PDB-NMA [7]. As then formulated, PDB-NMA necessarily retains proper molecular topology: no distortions 
of bond lengths or bond angles away from PDB values are allowed due to the use of dihedral degrees of freedom. That 
initial formulation examined solely the packing constraints of nonbonded van der Waal interactions, while current 
work incorporates restoring potentials associated with each dihedral degree of freedom as well [Hiiin]. 

As an alternative to dihedral-based PDB-NMA, various formulations using Cartesian coordinates have been pre¬ 
sented. These are generally referred to as Elastic Network Models (ENM), and include the popular Anisotropic 
Network Models (ANM) and Gaussian Network Models (GNM) [TTJ[T2]. These alternative approaches to dihedral- 
based PDB-NMA rely on structural coarse-graining, such as the use of a single coordinate per residue. While this 
permits analyses of vast molecular assemblies such as ribosomes, topological constraints are necessarily sacrificed. 
(Even when formulated with full-atomic coordinates, i.e. no coarse-graining, the absence of bond-length and bond- 
angle constraints in ENMs will not maintain standard molecular topology.) A recent formulation. Torsional Network 
Model (TNM), makes use of any suitable potential energy function and projects the Cartesian degrees of freedom onto 
dihedral coordinates in order to preserves topology. Recent reviews summarize the relative merits of dihedral-based 
PDB-NMA, ENM and TNM [HE]. 

While the internal symmetry axes of each PDB entry are of interest in and of themselves, as I try to demonstrate here, 
often modal analyses are used to extend or enhance trajectories computed by Molecular Dynamics (MD). Temporal 
trajectories, necessarily non-equilibrium pathways, are not well modeled by NMA for two reasons. NMA is valid only 
near energy minima and assumes a harmonic or linear response to perturbations. Efforts to better model the nonlinear 
aspects of force fields has led to the use of various Principal Orthogonal Decomposition (POD) algorithms, such as 
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Principal Component Analysis (PCA) and Singular Value Decomposition (SVD), where eigenvectors are projected 
out of suitably long-time MD trajectories [Hill]. The degree of overlap between eigenvectors derived from NMA and 
POD is still the basis of investigation [15]. 

Here we continue to explore the usefulness, range of validity as well as sensitivity of dihedral-based PDB-NMA [10] . 
Since no structure-distorting energy minimizations are required, we are able contrast the spectra of structures that 
differ only slightly, such as the two isoforms of the same molecule. To ascertain the significance of any difference, we 
will compare the flexibility characteristic of the same molecule, but solved in a different crystal form and also in the 
presence and absence of bound ligand. We also examine the computed flexibility spectrum of an evolutionary distant 
but structurally similar sample of the same enzyme. 

We find that PDB-NMA provides detailed, precise, rapid, and reproducible predictions of flexibility signatures of 
PDB entries. Our investigation on the flexibility characteristics of one class of enzymes within the Carbohydrate Active 
Enzyme (CAZy) database, the glycoside hydrolase family 10 (GHIO) xylanases, provides a mechanistic rationale for 
the distribution of experimental temperature factors, demonstrates how stability of key residues is maintained in the 
face of thermal perturbation, indicates the degree of homology in the flexibility patterns across enzymes with closely 
similar architecture, and identifies regions with distinct flexibility patterns. These analyses suggest that studying a 
protein’s flexibility characteristics is helpful in order to understand and categorize the unique consequences of the 
overall three dimensional conformation of each PDB entry. 


B. Protein System Studied 

Xylanases hydrolyze xylan, wood sugar polysaccharides of the aldopentose xylose. Unlike planar glucose polysac¬ 
charides, xylan adopts a three-fold, left-handed helical conformation and is often decorated by a variety of branched 
side chains. Xylanases are produced by a wide variety of organisms from bacteria to fungi, protozoa and even gas¬ 
tropods who use xylose as a primary carbon source. As each organism targets different sources of plant hemicelluloses 
and since xylan presents as complex, branched heteropolysaccharides with structures varying between plant species, 
xylanases come with differing sensitivities to structural details of substrate and environmental cues. Various commer¬ 
cial enterprises, such as the paper and pulp industries, wine production and brewing, textile and baking industries to 
name a few, process the hemicelluloses found in plants. This explains the intense interest and the large numbers of 
X-ray crystallographic structures involving xylanases to study their structures and activities under various conditions 
of temperature, pH, alkalinity etc. [T6] 

Xylanases belonging to the GHIO family of glycosidases possess a classic {a/P)s or TIM barrel fold. The central 
barrel, consisting of eight parallel ;d-strands, flares out from a narrower “stability face” at the N-terminal ends to a 
wider “catalytic face” at the G-terminal ends of the ;d-strands m- Xylanase GHIO members display their greatest 
diversity in structure in the positions and loop architecture at the open, catalytic face, where the substrate binding 
groove and active site are situated m- 

Family 10 xylanases hydrolyse the internal (1,4) glycosidic bonds linking xylose moieties using two conserved 
glutamate residues, one acting as a general acid/base and the other as a nucleophile. As one glutamate is situated at 
the G-terminal end of /d-strand 4 and the other at the end of /3-strand 7, these xylanases belong to the 4/7 superfamily of 
TIM barrel folds (also known as clan GH-A). Hydrolysis proceeds via a double displacement mechanism that retains 
the anomeric configuration of the glycosidic oxygen [18]. This reaction scheme involves formation of a covalently 
bound, glycosyl-enzyme intermediate, enabling these xylanases to perform transglycosylation (polymerization) as 
well as hydrolysis (depolymerization) reactions [TOl [20] . 

The substrates of family 10 xylanases bind in deep grooves to enable proper orientation and distortion of the 
sugar moieties to strain the glycosidic bond prior to cleavage. These binding grooves extend away from the catalytic 
glutamate residues, with “pockets” able to accomodate consecutive sugar moieties on both sides of the cleavage site. 
Each “subsite” is labeled: ...-2,-l,l,2,..., with negative subsites situated at the non-reducing, glycone end of the scissile 
bond, and positive sites situated on the reducing, aglycone side. The glycosidic bond between the xylose residues 
bound at subsites -1 and 1 is cleaved. Family 10 xylanases posses no fewer than 3 subsites, and possibly as many as 
6 or 7 [ig. 

Family 10 xylanases possess numerous conserved residues besides the catalytic glutamates, including three tryp¬ 
tophans that form an “aromatic cage” about subsite -1. One tryptophan in particular, situated in the loop after 
/3-strand 8 and present in all family 10 xylanases, is unique, with no equivalent aromatic residue found in any other 
4/7 superfamily of enzymes [21]. Experimental studies demonstrate that mutation or inactivation by oxidation of 
this tryptophan disables enzyme activity in xylanases [22] [23]. An early X-ray crystallographic study of another 
glycosidase, lysozyme, demonstrated that the oxidation of its active site tryptophan inhibits the binding of substrate 
and enzyme activity, and results in a rotation of the indole moiety [24]. Recent X-ray crystallographic studies on 
family 10 xylanases demonstrate that the orientation of the active site tryptophan in these enzymes correlates with 
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the presence or absence of ligand in the active site [21]. We were interested to observe whether PDB-NMA could 
detect the effect of the orientation of this residue in the vibrational signatures of these enzymes. In fact, we observe a 
clear difference, with one orientation of the active site tryptophan leading to a distinct dampening of the oscillatory 
motion. 


II. TECHNIQUE 
A. Method 

To compute eigenspectra and eigenmodes we maintained the same algorithm and parameterization, ATMAN, 
recently introduced for PDB-NMA and described in detail in m- The input data include the atomic coordinates, 
7^, and identities of every entry, i, in the PDB listing. Mainchain amide hydrogens were built in using the program 
reduce [25]. We used standard values for van der Waals radii RvdWii atomic masses, m^, and van der Waals energy 
well-depths, e^, for each atom type (Table |^. 

In addition to these atom-specific parameters, two additional inputs are required: the cut-off distance, D that is 
added to the sum of the van der Waal radii (Rvdw) for nonbonded interactions between atom pairs more than three- 
bond lengths apart, as well as a stiffness constraint, /c, on the sidechain dihedral y bonds. As shown in [10], large 
values of D result in vast numbers of nonbonded interactions (NBI) and lead to eigenspectra that differ substantially 
from those obtained using energy-minimized structures. Rather than obtain the characteristic “soft” responses in the 
eigenspectra frequency range 1 — 250 cm“^, a large value of D in ATMAN yields a very stiff signature extending from 
1 — 40 cm~^. To obtain optimal fits of the eigenspectra, therefore, we maintained a value of D equal to l.oA. To 
eliminate instabilities in the diagonalization of the normal mode equations caused by weakly bound surface sidechains, 
we used a uniform dihedral stiffness constant of 0.1 kcal/mol. Like the atom-specific parameters, both D and k were 
maintained to fixed values throughout these analyses (Table |I|. 


List of fixed ATMAN parameters 


Atom 

RvdW 

mass 

£ 

H 

1.10 

1.008 

0.001 

0 

1.60 

15.999 

0.18479 

N 

1.65 

14.007 

0.41315 

C 

1.85 

12.011 

0.07382 

A, B. G 

1.75 

12.011 

0.03763 

S 

1.85 

32.064 

0.07382 

D 

1.6 A 



k 

0.1 kcal/mol 



T 

180K 




TABLE I: H refers to hydrogen atoms, O to oxygen atoms, N to nitrogen atoms, C to tetrahedral carbons while A, B, and G 
refer to trigonal carbons and S refers to sulfur atoms. The van der Waals radii are given in Angstroms, masses are in daltons, 
and epsilons are in kcal/mol. The atomic data are from [261127j . The cutoff distance, D, and the dihedral bond stiffness 
constant, k, were as in [TO]. All RMSD were computed at 180K. 

In order to reestablish the overall energy scale of the computed eigenspectra one adjustable parameter, C, was 
necessary to analyze each PDB entry. Just as proteins across all classes and sizes obtain predictable mass per 
unit volume measures (density), proteins also obtain universal eigenfrequency spectra with predictable distributions, 
especially of slow modes [28]. We see, for example, both experimentally as well as within the NMA predictions, a 
distinct peak at 25 cm~^ that seems to be due to the inter-packing constraints of secondary elements [TO]. This peak, 
corresponding to vibrations occuring at the 1.3 psec time scale, does not correspond to the slowest motions observed 
by NMA, which occur on the 1 — 5 cm~^ or 30 — 7 psec level, and which seem to be driven by nonlocal NBIs. We use 
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the observation that the number of computed modes with eigenfrequencies under 20 cm~^ corresponds to 14% of all 
modes, to adjust the overall scale factor, C, for the PDB-NMA of each PDB entry. 

To visualize the motion described by each eigenfrequency an amplitude of activation, ai must be selected for each 
mode i. Typically this is achieved using classical conservation of energy considerations: each normal mode obtains 
a time-averaged potential energy of above the value at NS {ks is Boltzmann’s constant and T is the absolute 

temperature). This gives of = 2kBT/w‘f [4]. We chose a value of T = ISOK as experiments indicate that only below 
this temperature protein motility spectra may be modeled as simple harmonic oscillations [29] . 


B. PDB entries analyzed 

According to the CAZy database the X-ray crystallographic structures of 37 GHIO xylanases are currently deposited 
at the PDB. We report here on the PDB-NMAs of 5 of these: IGOKA and IGOKB, IGOM, IIIXA, IGOO and IVBR 
laiEoiiM]. The first 4 entries derive from the thermophilic fungus Thermoascus aurentiacus, while the final entry 
derives from the hyperthermophilic anaerobic bacterium Thermotoga maritima. These PDB entries as GHIO members 
possess high levels of sequence as well as structural similarities. As Table [H] shows, the T. aurentiacus xylanases 
with 2350 non-hydrogen atoms, display RMSDs of approximately O.SA in their three-dimensional conformations. 
For IGOKA and IGOKB, this RMSD is due to the different orientation of a small number of sidechains, while 
the RMSDs between the different crystal structures, IGOK, IGOM, IIIX, and IGOO, are due to slight structural 
differences distributed across the entire molecule. Even the structure from the hyperthermophilic bacterium, IVBR, 
with 2700 non-hydrogen atoms, which obtains a sequence similarity score of 35% to IGOKA, obtains a RMSD fit of 
l.OOA and scores a structural alignment score of 92% to IGOKA [32], emphasizing the structural similarity of these 
xylanases across species. Of the 5 PDB entries studied, the IGOO computations included a ligand, glycerol, in the 
active site pocket at subsite -1. 


PDB ID 

IGOKA 

IGOKB 

IGOM 

IIIXA 

IGOO 

IVBRA 

IGOKA 

302 

0.43 

0.41 

0.48 

0.38 

1.06 

IGOKB 

100 

302 

0.60 

0.67 

0.43 

1.07 

IGOM 

100 

100 

302 

0.48 

0.43 

1.07 

IIIXA 

98 

98 

98 

302 

0.60 

1.22 

IGOO 

100 

100 

100 

98 

302 

1.07 

IVBRA 

35 

35 

35 

36 

35 

324 


TABLE II: Similarities of the 5 PDB entries analyzed. RMSD values of all-atom matches recorded above the diagonal. For 
IVBRA RMSD values represent best fits after pairwise sequence matching that eliminates unmatched residues. Percentage 
of primary sequence identities recorded below the diagonal. Diagonal entries indicate the number of amino acid residues per 
entry. RMSD values provided by PyMol. 

We first examine the flexibility signatures of PDB entry IGOK in both its A and B isoforms. The structure of this 
302 amino acid, 2350 nonhydrogen-atom, polymer was determined to 1.14A resolution by Lo Leggio and coworkers 
[21]. Figure [^presents its structure in a ribbon representation, demonstrating the classic {a/P)s fold of this enzyme. 
Looking at the structure from the catalytic face such that the N and G terminal regions meet near the top of the 
figure, places the ligand binding groove roughly along the horizontal axis. This perspective places the catalytic Gin 
131 below and Gin 237 above the axis. A second axis running nearly perpendicular to the first becomes apparent in 
animations: this vertical axis runs from the top, between the noncovalently bound N and G terminal regions, between 
/3-strands 1 and 8 and /3-strands 3 and 4 to the bottom-most residue. Asp 100, at the start of the 3rd pa helix. The 
point of intersection of these two axes in Fig. marks the location of subsite -2. 

The electron density of the IGOK data revealed two approximately equal populations (A and B) in which 11 
residues could adopt one of two conformations m- Their all-atom RMSD is 0.43A. Upon inspection, it is seen that 
most of this difference is due to three residues adopting distinct conformations (Fig. [^. All three residues, Trp 275 
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FIG. 1: Ribbon representation of IGOK prepared using PyMol. The eight parallel /3-strands of the TIM barrel open at 
the catalytic face where the catalytic glutamate sidechains are drawn as sticks. The binding groove runs roughly along the 
horizontal axis and separates the upper and lower domain motion evident in modes 1 and 2. The nearly vertical axis separates 
two vertical domains that likewise oppose eachother in modes 1 and 2. The G and N terminal regions meet at the top, the 
“mallet” region is highlighted in the upper right quadrant, while the three short helices composing “chin” region are highlighted 
in the lower two quadrants. Gatalytic Gin 131 falls below the horizontal axis, while Gin 237 is above this axis. 


(light green), its neighbor Arg 276 (dark green) as well as Gin 46 (red), line the binding groove about subsites -1 
and -2, and are strictly conserved among GHIO members. The remainder of the binding groove residues, including 
the catalytic glutamates (magenta) as well as two arginines, 47 and 175 (yellow) known to promote enzyme-ligand 
association, are identically situated in the two isoforms. Indeed, excluding one sidechain, Trp 275, in calculating the 
all-atom RMSD between isoforms A and B reduces the fit from 0.43 to 0.27A. If the side chains of all three residues, 
Trp 275, Arg 276 and Gin 46 are excluded in the RMSD computation, the overall fit further improves to 0.15A. 

In the IGOK structures, Trp 275 in particular obtains an interesting position: at the end of the eighth /3a-loop of 
the TIM barrel, it projects down and over the subsite -1 hollow. The image of a lid comes to mind, an image that is 
re-inforced by the fact that Trp 275B seems to correspond to a “shut” and Trp 275A to an “open” orientation of the 
lid. The apparent lid-like opening and closing that Trp 275 effects is mediated by substantial shifts in the side chain 
dihedral values between isoforms A and B: (^xi^X 2 )= (70°, —61°) in A and (—75°, 38°) in B. The effect is dramatic: 
atom CZ2 of Trp 275, for example, sweeps out an arc of nearly SA from the A to B conformation. To accomodate 
such substantial local fluctuation, the neighboring Arg 276 side chain likewise shifts to a substantial degree: its CZ 
atom shifts by 5A from B to A confomers. And indeed, as pointed out by Lo Leggio and coworkers, Trp 275 in its 
A conformation sterically clashes with Arg 276 in its B conformation, making their assignments to population A or 
B unambiguous [21]. Intriguingly, no closed B conformations can be detected for Trp 275 or Arg 276 in structures 
solved with any type of ligand present in the binding groove. 

We were interested to observe whether a PDB-NMA could detect differences in eigenspectra and concomitant 
eigenmotions due to the small but pertinent shifts of 3 residues out of 302 in this high resolution structure. As we 
will show, we find a clear difference in the mobility spectra of IGOK A and B: the closed, B conformation dampens 
the oscillatory behavior seen in the A conformation. To assess possible relevance of this difference, we next compute 
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IGOKB 


N/C 


IGOKA 



FIG. 2: Surface representations of IGOK A (left), with the “open” orientation of Phe 275 and Arg 276 in green, and of IGOK 
B, the “closed” conformation. The catalytic glutamates are colored magenta and Glu 46 in red. Arg 47 and Arg 175, colored 
yellow, are known to be involved in the primary binding interactions with substrates at subsites -2 and 2. 


the flexibility spectra of T. aurentiacus xylanases IGOM and IIIX that also differ at the level of RMSD of O.SA, 
but due to delocalized structural differences. Again we observe clear and seemingly pertinent trends supporting our 
observations regarding the B conformation. We then assess whether the effects observed with Trp 275 and Arg 276 
in their B form is emulated by the presence of a ligand in subsite -1 by computing the eigenmodes of IGOO. We find 
that the presence of glycerol in the active site cleft does not effect the same dampening as in the B conformations of 
Trp 275 and Arg 276. Finally, we will compare the similarities in the flexibilities of fungal (IGOK) versus bacterial 
(IVBR) xylanases which further emphasize the possible relevance of observed trends. 


III. RESULTS 

Table Hill shows results of the dihedral-based PDB-NMA on the PDB entries. IGOKA and IGOKB obtain 9413 
vs. 9390 nonbonded interactions, with the closed, B, conformation losing 23 NBI due to loss of alignment of Trp 275 
and Arg 276 with the 8th and loops of the TIM barrel as it swings shut over the subsite -1 cavity. This 0.25% 

decrease in the number of NBI results in an increase of nearly 9% from the slowest mode frequency of 2.77 cm~^ for 
IGOKA to 3.02 cm~^ for IGOKB. Activating the NS according to each eigenvector results in RMSDs of 0.38A and 
O. 37 A for IGOKA and B; nearly identical but slightly bigger for the “softer” open A isomer, despite the smaller C 
value. The surprise, however, is in the concomitant distribution and largest deviation from NS for all Cq,. In Fig. 
we plot the RMSD from the NS of each Ca due to thermal activation of the slowest mode for IGOKA (blue) and B 
(orange). The curves closely overlap, as anticipated for two such similar structures, with a notable exception in the 
C terminal domain. The curves in Fig. match closely until Trp 267, when the open A confomer suddenly obtains 
decidedly larger RMS values than those of the B isomer until Tyr 294 when the RMS values again match until the C 
terminal residue. The transition point, 267, occurs at the C terminal end of the eighth ^d-strand (residues 261-266) 
and belongs to one of the conserved tryptophan residues lining subsite -1. This differently mobile region extending 
from Trp 267 to Tyr 294 includes the lid residues Trp 275 and Arg 276, and is highlighted in the upper right quadrant 
in the ribbon representation of Fig. We will refer to this region as the “mallet”. Interestingly, this particular 
loop structure, the mallet, seems to divide the members of GHIO xylanases into two subsets: molecules with (subset 
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2) or without (subset 1, like IGOK) an additional a-helical stretch immediately after the conserved Trp-275 or its 
equivalent [21]. Counterintuitively, but probably due to the loss of mobility of the B confomer at the C terminal 
region, this confomer, with slightly fewer NBI and a stiffer C, obtains a decidedly larger maximum deviation, at 
residue 100, than the closed confomer: 3.72A for B vs. 3 . 23 A for A, a 15% shift. Asp 100 and the short helix-loop 
preceding it (92-99) as well as the two neighboring short a-helical regions (50-58 and 142-149) are highlighted in the 
lower half of Fig. and will be referred to as the “chin” region. 


PDB 

entry 

NBI 

(NBI/ 
atom) 

C 

Eigenmode Number 

1 

2 

c/' 

RMSD 

ARmax 

c/’ 

RMSD 

ARmax 

1GOKA 

9413 

(3.62) 

114 

2.77 

0.38 

3.23 

3.20 

0.34 

2.65 

1GOKB 

9390 

(3.61) 

117 

3.02 

0.37 

3.72 

3.39 

0.32 

2.55 

1GOM 

9457 

(3.64) 

105 

2.87 

0.37 

3.20 

3.38 

0.34 

2.83 

1I1XA 

9340 

(3.60) 

127 

2.74 

0.39 

3.23 

3.38 

0.32 

2.83 

1GOO 

9619 

(3.70) 

85 

3.10 

0.36 

3.10 

3.51 

0.32 

2.16 

1VBR 

10,407 

(3.53) 

98 

2.49 

0.38 

3.54 

2.85 

0.34 

2.95 


TABLE III: Results of dihedral-based PDB-NMA on PDB entries. NBI gives the numbers of nonbonded interactions included 
for each computation, NBI/atom gives the average numbers of interaction per atom, and the constant C gives the overall scale 
factor. The frequencies / are given in units of cm~^, the RMSD are computed for 180K over all atoms, and the maximum 
deviation of any atom from NS is given, in A, under ARmax- 

We studied the mobility patterns via 3d animations to better understand the nature of the computed flexibilities. 
The slowest modes of IGOK A are presented as GIF animations (using LICEcap) in surface representations (prepared 
with PyMol) with the catalytic glutamates in magenta and the subsite -1 tryptophan triad in green in the folders 
MODEl and MODE2 at [33| as well as a pair of stationary images in Fig. The slowest mode of IGOK presents 
as a “chomping” motion, Pacman-style, of the C and N terminal region above the horizontal axis of Fig. relative 
to the region below this axis. The motion at first glance appears very similar for the IGOK A and B isoforms, with 
a remarkable plasticity surrounding a pronouncedly stable core at the active site. The catalytic glutamates that 
remain at a steadfast 5.5A separation, for example, seem enabled by a design scheme that deflects innate motility to 
other portions of the molecule. Rather than a motility scheme that compartmentalizes the motions into blocks with 
tidy divisions between “fixed” stability faces and /3-barrels and highly flexible mobility faces, the “shock-absorption” 
is distributed throughout the molecule, with portions adjacent to the stability points deflecting any propensity to 
distortions to other regions. The net effect is remarkable stability for select nonlocal, nonbonded interactions in the 
face of “indiscriminate” thermal agitation. 

Further inspection of the 3d animations also reveals, like the RMS plots, that isoform A obtains a more pronounced 
swinging of the mallet region than isoform B. Studying the motility pattern, it becomes clear that several features 
permit this mallet region to obtain such large yet stable RMS deviations. The residues immediately preceding the 
mobile region starting at Trp 267 belong to /3-strand 8 (residues 261-266) which is firmly embedded and anchored by 
the /3-barrel construction. The loop immediately preceding this /3-strand, linking ^d-strand 8 to a-helix 7 (residues 
244 to 258), is further stabilized by a disulfide bridge linking the N terminal residue of /3-strand 8, Gys 261, to 
the N terminal region of a-helix 7 at Gys 255. The residues beyond the mobile region ending at Tyr 294 form a 
G-terminal a-helix (residues 291-302) that is anchored between the N-terminal a-helix (residues 5-13) on one side and 
the neighboring a-helix 7 on the other side, and which is itself stabilized, as mentioned, by the disulfide bridge between 
residues 255 and 261. Together, these features “ground” the mobile 267-294 region, maintaining overall structural 
integrity even as this region itself executes sizeable RMS excursions. It is precisely this motion of the mallet which is 
hampered and dampened in isoform B. 

The chin region, meanwhile, having the largest moment arm, sweeps out the largest arc and obtains the largest 
RMSD in both conformers. Inspecting the disposition of the residues of this region (Fig. [^, we note that the short 
(a-helix (residues 92-97) preceding the highly mobile Asp 100 might move en masse with the neighboring short a-helix 
at 142-149 at the end of /3-strand 4. In fact this does not happen in IGOK as we see a clear tendency of these two 




















FIG. 3: RMSD for each Ca due to mode 1 for IGOKA (blue) and IGOKB (orange). The average B factors for each residue, 
scaled to 10% the experimental values, are indicated by the dotted line. 



FIG. 4: RMSD for each Ca due to mode 1 for IGOKA (blue), IGOM (green) and IIIXA (black) 


regions to pull apart. The mobile 92-100 chin region rather moves in tandem with the short a-helix-loop above it, 
residues 50-58. The reason seems to be due to better stacking interactions between these two regions, including two 
indole rings. Trp 51 extends down from the upper, vertical helix of the chin, while Trp 94 is oriented upwards from 
the mobile 92-99 helix. This propensity of the /3-strands 3 and 4 loops to pull away from each other mirrors that 
tendency between the C and N terminal domains, and creates the impression of a dynamic vertical axis, with residues 
on one side of the axis tending to pull away from residues on the other side. The point of intersection of these two 
axes is interesting, as residues Gin 46 and Gin 47, important ligand recognition and/or binding residues belonging to 
subsite -2, lie on one side of this divide, while Trp 275 and the other residues lining subsite -1 lie on the other side 
of this partition. This could result in a possible tension developing along an oligosaccharide spanning subsites -2 to 
+2, for example. In the motility pattern of IGOKB, there is a decreased chomping over the horizontal axis, and an 
increased separation between the vertical domains due to the alignment of the indole ring of Trp 275 against Gin 46 
and Gin 47. The net effect is a frustration of the chomping mode in IGOKB and an increased propensity to distort 
the shape of subsite -2. 

Interestingly, the experimentally observed B-factors of IGOK, plotted as the dotted curve in Fig.[^ seem to suggest 
a large contribution from IGOKA-type motility, with two peaks in the 267-294 region similar to those observed in 
the RMS plot of mode 1 of IGOKA. 
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FIG. 5: Stationary images demonstrating the effects of modes 1 on IGOKA (left) and IGOKB (right). The images, prepared 
by YASARA, are in the orientation adopted in Figures 1 and 2 and show 5 different amplitudes of activation of mode 1 on 
the NS: ai sin(^), with n=0,...,4. The color scheme is by secondary structure, with sheets red, helices dark blue, turns green 
and coils light blue. Regions that remain stationary have closely overlapping elements, as for the 8 red /3 strands. Regions 
displaying greater motility present separated elements, especially noticeable in the mobile 267-294 region, the mallet, in the 
top center as green and light blue of each image. The mallet on the left, of the open IGOKA, is oscillating far more than 
the one on the right, corresponding to the closed IGOKB. These signihcantly differing signatures are due to the alternate 
orientations of merely 25 non-hydrogen atoms of Phe-275 and Arg-276, out of 2350 non-hydrogen atoms total, in the analysis. 
This presentation also provides a sense of the overall harmonicity of the motion, with the central NS structure straddled by 
the positive and negative amplitudes of activation of mode 1. 


To test whether these differences in the presentation of mode 1 of IGOKA and IGOKB are significant, we compared 
their modes to those computed of two other molecules. One, IGOM, is of the same protein but crystallized to a different 
crystal form (form I rather than II of IGOK) and solved by Lo Leggio and coworkers to 1.94A resolution [21]. IGOM 
obtains a single conformation, equivalent to IGOKA, and has an all-atom RMSD of 0.4lA compared to IGOKA 
distributed equally over all residues: no single residue mismatches exist, except for a slight re-orientiation of Trp 275. 
The other molecule we examined, IIIX, is also of xylanase derived from a T. auriantacus strain, from Indian soil, and 
has a 297 out of 302 sequence-identity to IGOK and IGOM. The structure IIIX was solved by Natesh and coworkers 
to l.llA resolution and obtained two confomers, A and B [30]. IIIX crystallized to crystal form I, like IGOM. The 
A and B confomers of IIIX differ in the locations of 23 surface residues: all active site residues and ligand binding 
groove residues are resolved in one conformation, equivalent to the A confomer of IGOK. We arbitrarily used IIIX A 
for the current analysis, which obtained an all-atom RMSD fit to IGOKA of 0.53A, and to IGOM of 0.48A (Table I^. 

The number of NBI and C values used for computing the modes of these two proteins are provided in Table III 
The frequency of the slowest mode of IGOM is 2.87 cm~^ and of IIIX is 2.74 cm~^, similar to the value obtained 
for IGOKA (2.77 cm~^) whose open Trp 275 structure they resemble. The RMSD due to thermal activation of each 
Ca atom in IGOM is plotted in Fig. (green), along with that of IGOKA (blue). The curves demonstrate very 
similar mobility patterns, except for mismatches at residues 1-19, 140-168, and for the amplitudes for several peaks. 
Differences in these two curves may be due to the different crystal packing interactions or to the different resolutions 
of each model. We can ascertain which factor predominates by comparing the RMSD plot due to thermal activation 
of IGOM to that of the other crystal form I structure, IIIXA (black). Here we see that the Ca deviations obtained 
by mode 1 of IGOM are also obtained by IIIXA, even though those structures differ by 0.48A. The match includes 
the 140-168 region, the magnitudes of the peaks, and a tendency for the displacements of residues 1-19 toward those 
of IGOM. In short, these curves demonstrate that in this case crystal packing effects more strongly correlate with 
thermal RMSD of the slowest mode than with resolution or with the precise details of the atomic orientations. 

To further explore the possibility that Trp 275, by adopting the closed IGOKB orientation, acts as a proverbial 
monkey wrench to block and frustrate the swing of the mallet region, we wondered whether another agent, namely 
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the presence of a ligand in subsite -1, in any way mimics the effects of this tryptophan. Lo Leggio and coworkers 
solved the structure of the crystal form II enzyme cryocooled to lOOK [21] . A molecule of the cryoprotectant, glycerol, 
a mimic for xylose, was found at subsite -1, while Trp 275 and Asn 276 adopted the open conformation analagous 
to IGOKA. How does the presence of this ligand affect the slowest mode? The PDB-NMA of IGOO, including the 
additional 6 rigid-body degrees of freedom for the glycerol molecule, is unambiguous: the presence of glycerol in 
subsite -1 does not inhibit the “swings” of the mallet region in the way that Trp 275B does. The RMSD plot of each 
Coi due to thermal activation of mode 1 (data not shown) shows that IGOO obtains a motility spectrum much like 
IGOM, with whom it shares an RMS fit of 0.5lA, including an active mallet region and a mobile N terminal domain. 
The presence of a small ligand in subsite -1 does not eliminate or frustrate the motion of the mallet region the way 
that Trp 275B does in IGOK. 

Finally, we examine the flexibility spectrum of a xylanase with an extra a-helix after the Trp 275 loop, and therefore 
belonging to subset 2 of xylanases from GHIO: IVBR. The hyperthermophilic T. maritima xylanase lOB structure, 
IVBR, was solved by Ihsanawati and coworkers to 1.8A resolution as a dimer m- We used one copy from the dimer 
for our analysis. IVBR obtains 324 residues and is therefore 22 residues longer than the other PDB entries analyzed. 
Superposing images of IVBR and IGOK reveals added residues in the loops after ^d-strands 8 (the Trp 275-equivalent 
loop) and 7, and at the end of the G terminal a helix, lengthening that helix by 5 residues relative to IGOK. Aside 
from these regions, the structural overlap between these two structures is excellent: IGOK achieves a structural match 
of 98% to that of IVBR, while their primary sequences obtain 35% identity (Table [h]). 

The resulting slowest mode bears a striking resemblance, and is essentially identical, to that of IGOKA, demon¬ 
strated by the stereo images available in the folder IVBR -|- IGOKmODEI at [33]. In this pair of GIF animations, 
green traces the mainchain of IGOKA and cyan traces that of IVBR. The sequence is presented in the customary 
frontal view equivalent to the orientation in Fig. as well as from the side after rotation of 90° about a vertical axis. 

The molecules move in tandem throughout the chain and throughout the sequence, and also obtain equivalent overall 
magnitudes of deformation; no additional parameterization was imposed to require comparable deviations. Glearly 
the architecture of the molecular chain selects for definite flexibility characteristics, irrespective of the particular 
primary sequence. These sequences also resolve a puzzle observed with subset 2 Family 10 xylanases: why the Trp 
275- equivalent residues (802 in IVBR) obtain relatively small temperature factors compared to their values among 
the subset 1 members. The active site Trps are not more disordered in subset 1 enzymes compared to those in subset 
2, they are less mobile than the extended mallet region of these structures. In IVBR, the mallet region once again 
starts at the Trp located at the G terminal end of ;d-strand 8 (residue 794) and extends to Tyr 826 in the G terminal 
(a-helix, but then obtains a further contribution from residues inserted in the loop sequence after /3-strand 7: Arg 
757-Gln 771. This latter region seem to “bulk up” one end of the mallet while the added residues in the G terminal 
helix seem to do the same for the obverse end of the mallet. It is interesting to note that the stabilizing disulfide 
bridge of IGOK, located toward the stability face of that molecule, has moved in IVBR. A cysteine located at the 
N-terminal end of a-helix 8 (residue 825), is situated to potentially form a disulfide bridge with Gys 775 in the middle 
of a-helix 7. In IVBR however, the distance of separation of their respective SG atoms precludes the presence of this 
bond. In view of the fact that this molecule derives from a hyperthermophilic organism that thrives in 80G conditions, 
the absence of disulfide bridges is surprising. 


IV. DISCUSSION 

A number of interesting observations ensue from the study of the PDB-NMA signatures of subset 1 and 2 GHIO 
xylanases. Molecules that share a high degree of structural homology display equivalent flexibility characteristics, as 
seen in the slowest modes of IGOKA, IGOM, IGOO, and IVBR. The slowest mode pertains to a chomping motion 
and the second slowest mode pertains to a grinding motion across the ligand binding groove [101133]. Plots of RMS 
deviations from NS due to thermal activation as well as 3d animations of eigenvectors demonstrate that select residues 
remain largely immobile while other portions of the molecule display sizeable excursions about their NS. The OE2 
atoms of the catalytic residues Gin 237 and Gin 131, for example, have a distance of separation of 5.57A in the NS. 
Thermal activation of mode 1 in the current protocol suggests this distance may increase by 0.04A or diminish by 
0.11 A. In contrast, the neighboring subsite -1 residues Trp 275 and Trp 87 behave very differently. GE3 of Trp 275 is 
a distance of 8.45A from GH3 of Trp 87 in the NS, and approach as close as 7.29A and separate as far as 12.26A, a 
net difference of nearly 5A compared to a net difference of 0.15A for the neighboring catalytic glutamates. It needs 
to be stressed that this criterion is not in any manner built into or required by the computation. This feature is an 
expression of packing constraints that uniquely define the internal symmetry axes of the molecule. The surprising 
immobility of key, nonbonded interactions permitted by the overall molecular design and resultant intra-molecular 
packing, may indeed have driven the selection of protein design. 

While select NBI remain relatively immobile during thermal agitation, other regions display concerted movement. 
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such as the ligand binding groove and the C terminal mallet region. The substrate binding groove especially experiences 
considerable realignment. The slowest mode demonstrates an enhanced likelihood for the binding groove to be more 
and less solvent exposed as the N/C terminal, upper domain and lower domain close across the binding groove with 
a characteristic period of around 12 psec. In addition, the slowest mode reveals an interesting region that we refer 
to as the mallet, a region after /3-strand 8 that obtains RMS shifts larger than other parts of the C/N domain. This 
particularly mobile region happens to segregate the GHIO members into two subsets depending on the nature of 
their loop design. Surprisingly, the mallet’s motility is suppressed by the realignment of the indole ring of a single 
tryptophan, a realignment observed in the A and B confomers of the same crystal structure, IGOK. What might be 
the consequences of such a mobile mallet region? A more thorough examination of the known GHIO structures might 
reveal connections between the size and mobility of this region and the nature of ligand bound, such as the degrees 
of polymerization and decoration for example. 

And how to interpret the different Trp 275 orientations in light of these analyses? Experimental results demonstrate 
xylanase inactivation by oxidation of the active site tryptophan [34]. Furthermore, as pointed out, the X-ray crystal¬ 
lographic structure of the TIM-barrel GH enzyme, lysozyme, revealed that the oxidation and inactivation of its active 
site tryptophan resulted in a concomitant rotation of the indole moiety. This suggests a possibility that the IGOKB 
structure represents an oxidized, inactive form of the enzyme. However, a different interpretation might be provided 
by the realignment of the active site tryptophan seen in lipases. Lipase activity is dependent on the orientation of 
a tryptophan residue in a surface loop, called the lid, being situated to either sterically hinder substrate access to 
the active site or to permit access to the active site [35|. A comparison of modes between these two enzymes might 
therefore prove of interest. 


V. CONCLUSION 

PDB-NMA provides a direct, rapid and reproducible means to sensitively probe the flexibility signatures of PDB 
entries. The realignment of one or two residues out of 300 is sufficient to create distinct mobility spectra, and likely 
have direct biochemical consequences in terms of the activity or inactivity of enzymes, for example. On the other 
hand, PDB entries with a large degree of structural homology display very similar flexibility characteristics, even in 
the absence of a high degree of primary sequence homology. 

We have examined primarily the slowest mode of several PDB entries. The shape of the slowest mode is principally 
due to the collective effect of thousands of nonbonded interactions and is therefore least sensitive to details in the 
potential energy formulation used to construct the eigenvalue equation. Higher frequencies probe greater details in the 
atomic potential and therefore require greater care in their formulation and computation. We here showed that the 
slowest eigenmode provides valuable insights into the unique flexibility characteristics of a particular molecular design, 
the TIM barrel fold in family 10 xylanases. We demonstated that select NBI display little to no deformation under 
thermal agitation while other regions obtain large deformations. Observed flexibility patterns highlight regions that 
move en masse and bring into focus features removed from the active site that may nonetheless affect enzyme function. 
Efforts to correlate flexibility patterns and their timescales to enzyme functionality may enhance understanding of 
protein design. 
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